AWS Textract OCR
AutomatR.Windows.Activities.AWSTextractOCR
The "AWS Textract OCR" activity in AutomatR integrates with Amazon Textract, a fully managed OCR service from Amazon Web Services (AWS). Use this activity to extract text and data from images or documents.
Properties
Name | Description |
---|---|
Input | |
Access Key | Specifies the AWS Access Key ID associated with your AWS account. This key is used to authenticate with the Textract OCR service. Accepts a String variable containing the AWS Access Key ID. |
Bucket Name | Specifies the name of the Amazon S3 bucket where the input document or image is stored. Textract processes documents stored in this bucket. Accepts a String variable containing the bucket name. |
File Name | Specifies the name of the document or image file stored in the specified Amazon S3 bucket. Textract performs OCR on this file. Accepts a String variable containing the file name. |
File Path | Specifies the local path to the document or image file. Use this property when the file is stored locally rather than in an Amazon S3 bucket. Accepts a String variable containing the local file path. |
Region | Specifies the AWS region of the Amazon Textract service to call (for example, "us-east-1"). Accepts a String variable containing the AWS region. |
Region Selection | Lets you select the image region to capture: click the ellipsis button (...) and drag the mouse to define the region of interest. This is useful when OCR should focus on a specific area of an image. This property does not accept variables because it relies on user interaction. |
Secret Key | Specifies the AWS Secret Access Key associated with your AWS account. This key is used to authenticate with the Textract OCR service. Accepts a String variable containing the AWS Secret Access Key. |
Misc | |
Display Name | The display name of the activity. A display name is automatically generated when you indicate a target. |
Optional | |
Delay | Specifies how long to wait, in seconds, before executing the Textract OCR activity. This can help with synchronization issues. Accepts an Integer variable containing the delay duration. Example: enter 1 to wait 1 second (1000 milliseconds). |
Output | |
Result | Outputs the result of the AWS Textract OCR operation, typically the extracted text along with additional information about the document. Accepts a variable of a suitable type (for example, a String variable) to store the OCR result. |
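
For readers who want to see how these inputs relate to the underlying service, the following Python sketch shows one way the properties could map onto a Textract request. It is not AutomatR's implementation. The helper names `build_document` and `run_textract` are hypothetical, and the use of the `boto3` library and the `DetectDocumentText` operation is an assumption. The sketch also assumes that Bucket Name plus File Name is used when both are set, and File Path otherwise.

```python
from pathlib import Path
from typing import Optional


def build_document(bucket_name: Optional[str] = None,
                   file_name: Optional[str] = None,
                   file_path: Optional[str] = None) -> dict:
    """Build the `Document` argument for a Textract request (hypothetical helper)."""
    if bucket_name and file_name:
        # The file lives in S3: reference it by bucket and object name.
        return {"S3Object": {"Bucket": bucket_name, "Name": file_name}}
    if file_path:
        # The file is local: send its raw bytes instead.
        return {"Bytes": Path(file_path).read_bytes()}
    raise ValueError("Provide Bucket Name and File Name, or File Path.")


def run_textract(access_key: str, secret_key: str, region: str, document: dict) -> dict:
    """Call Textract with explicit credentials (assumes boto3 is installed)."""
    import boto3  # third-party; imported here so build_document works without it

    client = boto3.client(
        "textract",
        region_name=region,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    return client.detect_document_text(Document=document)
```

A call such as `run_textract(access_key, secret_key, "us-east-1", build_document("your_s3_bucket", "sample.png"))` would then return the raw response, which corresponds loosely to what the Result property holds.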
How to use:
- Drag and drop the "AWS Textract OCR" activity onto the workflow.
- Configure the properties by providing the necessary AWS credentials, file information, and region details.
- Use the region selection feature to define the area of interest within the image.
- Optionally, configure the delay and customize the display name.
- Execute the workflow to perform OCR using the AWS Textract service.
Note: Ensure that the specified AWS credentials (Access Key, Secret Key) have the necessary permissions to interact with the Amazon Textract service.
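As a rough guide to what "necessary permissions" might look like, the Python sketch below builds a minimal IAM policy document. It is an assumption, not a requirement stated by AutomatR: the helper name `build_policy` is hypothetical, and the sketch assumes the activity calls Textract's `DetectDocumentText` operation and reads the input file from S3. Adjust the actions if your setup uses other Textract operations.

```python
import json


def build_policy(bucket_name: str) -> dict:
    """Return a minimal IAM policy for Textract OCR on files in one bucket (assumed actions)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow the OCR call itself.
                "Effect": "Allow",
                "Action": ["textract:DetectDocumentText"],
                "Resource": "*",
            },
            {
                # Allow reading the input objects from the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Print the policy as JSON so it can be reviewed before attaching it to a user or role.
print(json.dumps(build_policy("your_s3_bucket"), indent=2))
```

The S3 statement is only relevant when Bucket Name and File Name are used. A local File Path needs no S3 access.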
Example: The following configuration uses the "AWS Textract OCR" activity to extract text from an image stored in an Amazon S3 bucket:
AWS Textract OCR:
  Display Name: "Extract Text from Image"
  Access Key: "your_access_key"
  Secret Key: "your_secret_key"
  Bucket Name: "your_s3_bucket"
  File Name: "sample.png"
  Region: "us-east-1"
  Region Selection: [User Interaction]
  Result: extractedText
In this example, the activity uses the AWS Textract service to extract text from the "sample.png" image file stored in the specified Amazon S3 bucket. The region of interest is interactively defined by the user through the region selection feature. The extracted text is stored in the variable "extractedText" for further use in the workflow.
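As a rough illustration of working with the result, the Python sketch below pulls text lines out of a Textract-style response and keeps only those inside a rectangular region. It assumes the response follows the shape returned by Textract's `DetectDocumentText` operation: a `Blocks` list whose `LINE` entries carry `Text` and a `Geometry.BoundingBox` expressed as fractions of the page. The function name `lines_in_region` and the sample response are made up for illustration. Filtering after OCR is a stand-in technique, since the source does not say how the activity applies Region Selection internally.

```python
def lines_in_region(response: dict, left: float = 0.0, top: float = 0.0,
                    width: float = 1.0, height: float = 1.0) -> list:
    """Return the text of LINE blocks whose bounding box lies fully inside the region.

    Coordinates are fractions of the page (0.0 to 1.0), matching Textract's BoundingBox.
    """
    lines = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks
        box = block["Geometry"]["BoundingBox"]
        inside = (
            box["Left"] >= left
            and box["Top"] >= top
            and box["Left"] + box["Width"] <= left + width
            and box["Top"] + box["Height"] <= top + height
        )
        if inside:
            lines.append(block["Text"])
    return lines


# Made-up sample response with two lines of text and one word block.
sample_response = {
    "Blocks": [
        {"BlockType": "LINE", "Text": "Invoice 1001",
         "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.1, "Width": 0.3, "Height": 0.05}}},
        {"BlockType": "WORD", "Text": "Invoice",
         "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.1, "Width": 0.15, "Height": 0.05}}},
        {"BlockType": "LINE", "Text": "Total: 42.00",
         "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.6, "Width": 0.3, "Height": 0.05}}},
    ]
}

# Keep only lines in the top half of the page.
extracted_text = "\n".join(lines_in_region(sample_response, height=0.5))
print(extracted_text)  # prints "Invoice 1001"
```

Calling `lines_in_region(sample_response)` with the default arguments covers the whole page and returns both lines.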